Sparseness and speech pe
Author
Abstract
Can speech recognition in noise be modeled by exploring higher-order statistics of the combined signal, and how do changes in these statistics affect speech perception in noise? This study addresses these questions in two experiments. The first investigated the relationship between an established "glimpsing" model and the fourth-order statistic, kurtosis. The glimpsing model [1] proposes that listeners can explore the local speech-to-noise ratio (SNR) in short time segments (glimpses) and focus on regions where the SNR is high. The percentage of glimpsing area correlated very highly with kurtosis (r = 0.99; p < 0.01), suggesting that kurtosis can serve as a simpler index of glimpsing. The same experiment also examined the association between kurtosis and recognition of nonsense words (vowel-consonant-vowel, VCV) in babble-modulated noise, which was likewise very highly correlated (r = 0.97; p < 0.01). The second experiment focused on the relationship between sparseness and speech recognition scores for VCV words in natural babble noise made of 100 people talking simultaneously [2]. Kurtosis and speech recognition score were again highly correlated in this noise. A logistic regression analysis showed that 50% correct recognition was reached at a kurtosis of approximately 1.0.
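The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the frame length, the 3 dB threshold, the time-domain-only framing, the use of excess kurtosis (kurtosis minus 3), and all function names are assumptions chosen for illustration, and the abstract does not say which kurtosis convention it reports. The last helper only inverts an already fitted logistic curve p = 1 / (1 + exp(-(intercept + slope * k))) to find the kurtosis at which p = 0.5.

```python
import math


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3 (about 0 for Gaussian data).

    Assumes the input is not constant (non-zero variance).
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # variance
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


def glimpse_fraction(speech, noise, frame_len=160, threshold_db=3.0):
    """Fraction of time frames whose local SNR exceeds threshold_db.

    Simplification: frames are taken in the time domain only;
    frame_len and threshold_db are illustrative values.
    """
    n_frames = min(len(speech), len(noise)) // frame_len
    if n_frames == 0:
        return 0.0
    glimpsed = 0
    for i in range(n_frames):
        start = i * frame_len
        s = speech[start:start + frame_len]
        v = noise[start:start + frame_len]
        p_s = sum(x * x for x in s) / frame_len  # speech power in frame
        p_n = sum(x * x for x in v) / frame_len  # noise power in frame
        if p_s > 0.0 and (p_n == 0.0
                          or 10.0 * math.log10(p_s / p_n) > threshold_db):
            glimpsed += 1
    return glimpsed / n_frames


def kurtosis_at_half_correct(intercept, slope):
    """Kurtosis at which a fitted logistic curve predicts 50% correct."""
    return -intercept / slope


if __name__ == "__main__":
    # Toy signals: a sparse pulse train versus a dense square wave.
    sparse = ([0.0] * 9 + [1.0]) * 10
    dense = [-1.0, 1.0] * 50
    print(excess_kurtosis(sparse), excess_kurtosis(dense))
    # Made-up coefficients, not values from the study.
    print(kurtosis_at_half_correct(-1.5, 3.0))
```

In this toy comparison the sparse signal has the larger kurtosis, which is the sense in which kurtosis acts as a sparseness index. In practice the signals would be the recorded speech and noise waveforms, and the logistic coefficients would come from fitting recognition scores against measured kurtosis.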
Similar resources
An analysis of sparseness and regularization in exemplar-based methods for speech classification
The use of exemplar-based techniques for both speech classification and recognition tasks has become increasingly popular in recent years. However, the notion of why sparseness is important for exemplar-based speech processing has been relatively unexplored. In addition, little analysis has been done in speech processing on the appropriateness of different types of sparsity regularization const...
A Noise Estimation Method Based on Speech Presence Probability and Spectral Sparseness
This paper addresses the problem of noise power spectrum estimation. Existing noise estimation methods do not perform reliably when the noise level increases abruptly (e.g., narrowband noise bursts). To overcome this problem, we improve the time-recursive averaging algorithm based on speech presence probability (SPP) by exploiting the sparseness of the speech spectrum. Firstly, we utilize the S...
Blind Separation of More Speech than Sensors with Less Distortion by Combining Sparseness and ICA
We propose a method for separating speech signals with little distortion when the signals outnumber the sensors. Several methods have already been proposed for solving the underdetermined problem, and some of these utilize the sparseness of speech signals. These methods employ binary masks that extract a signal at time points where the number of active sources is estimated to be only one. Howev...
Speech recognition based on Itakura-Saito divergence and dynamics/sparseness constraints from mixed sound of speech and music by non-negative matrix factorization
We considered a speech recognition method for mixed sound, which is composed of both speech and music, that only removes music based on non-negative matrix factorization (NMF). We used Itakura-Saito divergence instead of Kullback-Leibler divergence to compare the cost function, and the dynamics and sparseness constraints of a weight matrix to improve speech recognition. For isolated word recogn...
An Overview of Data-Driven Part-of-Speech Tagging
Over the last twenty years or so, approaches to part-of-speech tagging based on machine learning techniques have been developed or ported to provide high-accuracy morpho-lexical annotation for an increasing number of languages. Given the large number of morpho-lexical descriptors for a morphologically complex language, one has to consider ways to avoid the data sparseness threat in standard ...